Sizing Applications for Solaris - Part 2
By Adrian Cockcroft, Staff Engineer, SMCC Product Marketing
This article is the second in a two-part series focusing on how to correctly size applications for the Solaris 2 operating environment, and how to avoid some of the
most common problems and pitfalls inherent in the sizing process. Part I, which
appeared in the Spring 1995 issue of Catalyst Flash, outlined how to choose the correct
workload for your application. Part II discusses how to collect and evaluate test results
given a particular workload. The information in this series is summarized from the
recently published book, Sun Performance and Tuning: SPARC and Solaris, and the
1993 whitepaper, Sun Performance Tuning Overview White Paper, both by Adrian
Cockcroft.
In general, when you tune an application, you focus on measuring the
application processes and adjusting them to reduce their resource
consumption. However, when you size an application, you focus, instead, on
the system as a whole as you work out what type of processor, I/O and
memory configurations your application requires for a range of common end-
user workloads. Unfortunately, application sizing is often an afterthought, or is
not performed at all. However, if you plan from the start, you can avoid some
common pitfalls. This article should help to point you in the right direction.
Measuring the Workload on a System
At this point, you have decided which tests you wish to run on your system.
Now, you need to know what measurements to make, and what to look for in
those measurements. In the remainder of this article, I will outline how to set
up data collection. I'll also cover the most common sizing issues, so you should
be able to tell when the system is short of disks, network bandwidth, memory,
or CPU power.
Using Accounting to Monitor the Workload
If you have access to a group of real end-users over a long period of time, then
enable the UNIX system accounting logs. This can be useful on a network of
workstations as well as on a single time-shared server. From this you can
identify how often programs run, how much CPU time, disk I/O, and memory
each program uses, and what work patterns look like throughout the week. To
enable accounting, enter the three commands shown at the start of Figure 1.
Also refer to the section "Administering Security, Performance, and
Accounting in Solaris 2" in the Solaris System Administration Answerbook and see
the acctcom command. You must also add crontab entries to summarize and
checkpoint the accounting logs. Collecting and checkpointing the accounting
data puts a negligible additional load onto the system, but the summary scripts
that run once a day or once a week can have a noticeable effect, so you should
schedule them to run after hours.
Figure 1 How to Start System Accounting in Solaris 2
# ln /etc/init.d/acct /etc/rc0.d/K22acct
# ln /etc/init.d/acct /etc/rc2.d/S22acct
# /etc/init.d/acct start
Starting process accounting
# crontab -l adm
#ident "@(#)adm 1.5 92/07/14 SMI" /* SVr4.0 1.2 */
#min hour day month weekday
0 * * * * /usr/lib/acct/ckpacct
30 2 * * * /usr/lib/acct/runacct 2> \
/var/adm/acct/nite/fd2log
30 9 * * 5 /usr/lib/acct/monacct
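Once the logs are being collected, acctcom reads them back record by record. As a sketch of the kind of per-command summary you can derive, the pipeline below totals CPU seconds and run counts per command with awk. The field positions are assumptions based on the default acctcom column order (command name first, CPU seconds in the seventh column); check them against the header on your system. A here-document of sample lines stands in for running acctcom -a against a live pacct file.

```shell
# Summarize total CPU seconds and run counts per command.
# Field positions ($1 = command, $7 = CPU secs) are assumptions based on
# the default acctcom output format; verify against your own header.
# The here-document stands in for "acctcom -a" on a live system.
cat <<'EOF' |
sh       adm   ?   10:00:01 10:00:02   1.00  0.02  120
find     root  ?   10:05:00 10:07:30 150.00 12.50  340
sh       adm   ?   11:00:01 11:00:02   1.00  0.03  120
EOF
awk '{ cpu[$1] += $7; runs[$1]++ }
     END { for (c in cpu)
             printf "%-10s %4d runs %8.2f CPU secs\n", c, runs[c], cpu[c] }' |
sort -k4 -rn
```

Sorting on the CPU column puts the heaviest consumers first, which is usually what you want to know when deciding what to size for.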
Collecting Long-term System Utilization Data
As a matter of course, collect overall system utilization data on all the
machines you deal with. This facility already exists for Solaris 2; simply
uncomment the entry for the sys user in the crontab file.
The Solaris 2 utilization log consists of a sar binary log file taken at 20-minute
intervals throughout the day and saved in /var/adm/sa/saXX, where XX is
the day of the month. This collects a utilization profile for an entire month. You
should save the monthly records for future comparisons. When a performance-
related problem occurs, it is far easier to identify the source of the problem if
you have measurements from a time when the problem was not present.
Remember that the real-life user workload is likely to increase over time. You
should try to produce a plot to look for long-term utilization trends.
An example crontab file is shown in Figure 2. Note that sar does not collect
network-related information.
Figure 2 crontab Entry for Long-term sar Data Collection
# crontab -l sys
#ident "@(#)sys 1.5 92/07/14 SMI" /* SVr4.0 1.2 */
#
# The sys crontab should be used to do performance collection. See cron
# and performance manual pages for details on startup.
#
0 * * * 0-6 /usr/lib/sa/sa1
20,40 8-17 * * 1-5 /usr/lib/sa/sa1
5 18 * * 1-5 /usr/lib/sa/sa2 -s 8:00 -e 18:01 -i 1200 -A
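To look for the long-term trend, you can post-process the daily text reports that sa2 writes. The sketch below averages the %usr and %sys columns of a sar -u style report; the column layout shown is an assumption, so check it against the header on your own system, and the here-document stands in for reading a real /var/adm/sa report.

```shell
# Average the daily CPU utilization from a "sar -u" style text report.
# The column layout (time, %usr, %sys, %wio, %idle) is an assumption;
# the here-document stands in for a saved report from /var/adm/sa.
cat <<'EOF' |
00:00:00    %usr    %sys    %wio   %idle
08:20:00      20      10       5      65
08:40:00      40      15      10      35
09:00:00      30      15       5      50
EOF
awk 'NR > 1 { usr += $2; sys += $3; n++ }
     END { printf "avg %%usr %.1f  avg %%sys %.1f over %d samples\n",
                  usr/n, sys/n, n }'
```

Appending one such daily average per line to a file gives you a simple series to plot month over month.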
Interpreting the Measurements
Once you have measured the performance of your application and the
utilization of the system on which it is running, you have a problem. The
numbers reported by commands like vmstat, iostat, netstat and are
mostly useful for debugging and tuning the Solaris kernel itself. They are
poorly documented, the meaning and expected behavior of each metric is not
clear, and the behavior changes without notice between Solaris releases. If you
have UNIX systems from other vendors, you will find that there is minimal
consistency between UNIX implementations.
Just because two different systems print a column of numbers from vmstat
with the same heading, you cannot assume that the same virtual memory
algorithm and parameters underlie those measurements. There is some work
underway to produce an X/Open standard for performance measurements, but
current standards only define the interface. The implementation and performance
of a standardized interface can vary.
I have tried to document the meaning of most of these measurements for both
Solaris 1 and Solaris 2 in my 1993 white paper, Sun Performance Tuning
Overview White Paper and my recent book, Sun Performance and Tuning: SPARC
and Solaris. Within the limited space of this article, I will try to provide enough
information to help you determine where overloading may have occurred:
disks, network, available RAM, or CPUs.
The system will often have a disk bottleneck.
In many cases the most serious bottleneck is an overloaded or slow disk. Use
iostat -x 30 to look for disks that are more than 30 percent busy and have
service times of more than 50 ms. The service time is key; this is the time
between a user process issuing a read and the read completing (for example),
so it is often in the critical path for response time. If many other processes are
also accessing that disk, a queue can form, and service times of over 1000 ms
(not a misprint, over one second!) can easily occur as you wait to get to the
front of the queue. A service time of 15 to 20 ms on active disks is healthy.
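A quick way to apply those thresholds is to filter the iostat -x output with awk. The column positions below (svc_t in column 8, %b in column 10) are an assumption based on the usual Solaris 2 iostat -x header, so verify them on your release; a here-document of sample output stands in for the live command.

```shell
# Flag disks that are over 30 percent busy with service times over 50 ms.
# Column positions ($8 = svc_t, $10 = %b) are assumptions based on the
# usual Solaris 2 "iostat -x" header; verify them on your release.
# The here-document stands in for "iostat -x 30" on a live system.
cat <<'EOF' |
disk      r/s  w/s   Kr/s   Kw/s wait actv  svc_t  %w  %b
sd0       0.1  0.2    0.8    1.6  0.0  0.0   18.2   0   1
sd3      12.4 30.1   99.2  240.8  4.1  2.2  182.4  10  78
EOF
awk 'NR > 1 && $10 > 30 && $8 > 50 {
       printf "%s: %s%% busy, %s ms\n", $1, $10, $8 }'
```

In this sample only sd3 is flagged, which tells you where to start rebalancing.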
After first pass tuning, the system will still have a disk bottleneck!
Keep checking iostat -x 30 as tuning progresses. When a bottleneck is
removed, the system may start to run faster, and as more work is done, some
other disk will overload. At some point you may need to stripe filesystems and
tablespaces over multiple disks.
Poor NFS response times may be hard to see.
Waiting for a network-mounted filesystem to respond is not counted in the
same way as waiting for a local disk. The system will appear to be idle when it
is really in a network I/O wait state. Use nfsstat -m to find out which NFS
server is likely to be the problem, and tune its disks. You should look at the
NFS operation mix with nfsstat on both the client and server and, if writes
are common or the server's disk is too busy, configure a PrestoServe or
NVSIMM non-volatile write cache in the server. See that you do not overload
the Ethernet by checking that the collision rate is low. If collisions are above
five percent, try to split up the network or replace it with something faster. For
more information, see SMCC NFS Server Performance and Tuning Guide on the
Solaris 2.4 SMCC Hardware AnswerBook CD.
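The collision rate can be computed from the netstat -i counters. The sketch below assumes the usual netstat -i column order (output packets in column 7, collisions in column 9); check it against the header on your system. Sample counters stand in for the live command.

```shell
# Compute the Ethernet collision percentage from "netstat -i" counters.
# Column positions ($7 = Opkts, $9 = Collis) are assumptions based on the
# usual Solaris netstat -i header; verify them on your system.
# The here-document stands in for "netstat -i" on a live system.
cat <<'EOF' |
Name  Mtu  Net/Dest  Address   Ipkts  Ierrs Opkts  Oerrs Collis Queue
lo0   8232 loopback  localhost 1000   0     1000   0     0      0
le0   1500 labnet    labhost   500000 2     400000 0     32000  0
EOF
awk 'NR > 1 && $7 > 0 {
       pct = 100 * $9 / $7
       printf "%s: %.1f%% collisions%s\n", $1, pct,
              (pct > 5 ? "  <-- overloaded" : "") }'
```

A rate above the five percent threshold on a real interface is the cue to split the network or move to faster media.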
Avoid the common memory usage misconceptions
When you look at vmstat free, please don't waste time worrying about
where all the RAM has gone. After a while, the free list will stabilize around
one sixteenth of the total memory configured, or at 1MB or less, depending
upon the Solaris release and kernel architecture in use. The system stops
bothering to reclaim memory from the file cache above this level, even when
you aren't running anything. This is normal behavior.
Don't panic when you see page-ins and page-outs in vmstat.
These are normal since all filesystem I/O is done using the paging process.
Hundreds or thousands of Kbytes paged in and paged out are not a cause for
concern, just a sign that the system is working hard.
Use page scanner activity as your RAM shortage indicator.
When you really are short of memory, the scanner will run continuously at a
high rate (over 200 pages/second averaged over 30 seconds). If it runs in
separated high-level bursts, you should try patching slowscan to 100 so that the
bursts become longer and slower.
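The scan rate appears in the sr column of vmstat, so averaging it over a run shows whether the 200 pages/second threshold is really being sustained. The sr column position below is an assumption based on the usual Solaris 2 vmstat layout (r b w swap free re mf pi po fr de sr ...); check your header, and note that sample lines stand in for the live command.

```shell
# Average the page scanner rate (sr column) over a vmstat run.
# The sr column position ($12) is an assumption based on the usual
# Solaris 2 vmstat layout; verify it against your own header.
# The here-document stands in for "vmstat 30" on a live system.
cat <<'EOF' |
 0 0 0  3312 1184  0  4  2  1  1  0 250 x
 0 0 0  3312 1100  0  6  3  2  2  0 310 x
 1 0 0  3312  996  0  5  2  1  1  0 280 x
EOF
awk '{ sr += $12; n++ }
     END { avg = sr / n
           printf "average sr = %.0f pages/sec%s\n", avg,
                  (avg > 200 ? "  <-- RAM shortage" : "") }'
```

Averaging over several 30-second samples, as here, avoids being misled by a single short burst of scanning.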
Look for a long run queue (vmstat procs r).
If, in a multiuser system, the run queue or load average is more than four times
the number of CPUs, then processes wait too long for a slice of CPU time. This
waiting can increase the interactive response time seen by users. Add more or
faster CPUs to the system.
Look for processes blocked waiting for I/O (vmstat procs b).
This is a sign of a disk bottleneck. If the number of processes blocked
approaches or exceeds the number in the run queue, check the disks. If you are
running database batch jobs, it is OK to have some blocked processes, but you
can increase batch throughput by removing disk bottlenecks.
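Both of these checks can be sketched together by comparing the vmstat r and b columns against the rules of thumb above. The single-CPU setting and the sample lines below are illustrative assumptions; set ncpu to the number of CPUs in the system under test, and feed it real vmstat output.

```shell
# Compare the vmstat run queue (r, $1) and blocked (b, $2) columns
# against the rules of thumb above. NCPU=1 is an illustrative
# assumption; set it to the CPU count of the system under test.
# The here-document stands in for "vmstat 30" on a live system.
NCPU=1
cat <<'EOF' |
 9 1 0  3312 1184  0  4  2  1  1  0  0
 2 5 0  3312 1100  0  6  3  2  2  0  0
EOF
awk -v ncpu="$NCPU" '
     $1 > 4 * ncpu { print "interval " NR ": run queue " $1 " > " \
                           4 * ncpu " - add CPU" }
     $2 >= $1      { print "interval " NR ": " $2 " blocked vs " $1 \
                           " runnable - check disks" }'
```

The first sample interval trips the CPU rule, the second trips the disk rule, which matches how the two symptoms tend to alternate as you tune.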
Check for CPU system time dominating user time.
If there is more system time than user time and the machine is not an NFS
server, you may have a problem. To find out the source of system calls, use the
truss command. To look for high interrupt rates and excessive mutex
contention, use the mpstat command. If the smtx column on a multiprocessor
system is more than 200 per CPU, and there is a coincident increase in system
CPU time for that CPU, then you have probably reached the limit of MP
scalability for that workload/OS release combination. A subsequent OS release
is likely to scale better.
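The smtx check can be sketched the same way. The smtx column position below is an assumption based on the usual mpstat header, so verify it on your release; sample lines stand in for running mpstat on a live multiprocessor.

```shell
# Flag CPUs whose mutex stall rate (smtx) exceeds 200 per second.
# The smtx column position ($10) is an assumption based on the usual
# mpstat header (CPU minf mjf xcal intr ithr csw icsw migr smtx ...);
# verify it on your release. The here-document stands in for "mpstat 30".
cat <<'EOF' |
CPU minf mjf xcal  intr ithr  csw icsw migr smtx  srw syscl  usr sys  wt idl
  0   10   0   50   220  100  300   20   15  420    0  1200   40  55   0   5
  1    8   0   40   180   90  250   15   12   80    0   900   60  25   0  15
EOF
awk 'NR > 1 && $10 > 200 {
       print "CPU " $1 ": smtx " $10 " - possible MP scalability limit" }'
```

Before concluding that you have hit the scalability limit, confirm that the flagged CPU also shows the coincident rise in system time described above.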
Additional Information
For additional information, you can refer to the following publications:
- SMCC NFS Server Performance and Tuning Guide
This is on the Solaris 2.4 SMCC Hardware 11/94 CD in the hardware-specific
AnswerBook manual.
- Sun Performance and Tuning: SPARC and Solaris, Adrian Cockcroft, SunSoft
Press/Prentice Hall, January 1995. ISBN 0-13-149642-3.
For more information try http://www.sun.com/smi/ssoftpress/index.html.
- Sun Performance Tuning Overview White Paper, Adrian Cockcroft, December
1993.
This paper is available for ftp from most of the SunSite servers. Look for the
file SunPerfOvDec93.ps.Z. A Japanese translation was published as volume
10 of the Nihon Sun "Expert" series.
- The Art of Computer Systems Performance Analysis, Raj Jain, Wiley, 1992.
ISBN 0-471-50336-3.